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Abstract 


It is well known that present day remote sensing satellite systems are often constrained 
by downlink communication bandwidth. To meet this challenge collected image data 
need to be compressed efficiently. As the collected images exhibit a high degree of spa- 
tial and spectral redundancy , in the present work exploitation of these redundancies 
are utilized to achieve high compression. 

A 3-D transform based compression technique is developed and its performance is 
compared with other reported algorithms. [28] [1] 

The compression system used comprises of two subsystems : a spectral dcorrelation 
subsystem followed by a spatial decorrelation subsystem. The spectral decorrelation 
is carried out in three stages (a), data partitioning for terrain adaptive approach only 
(b). The optimum KL transform for energy compaction of data and (c). mapping of 
the eigen planes to eigen images . The decorrelated eigen images are fed to the spa- 
tial decorrelation module which consists of embedded zerotree coder based on wavelet 
transform. Wavelet transform exploits human visual perception deficiencies and offers 
the opportunity for high compression ratios. 

Compression system has been evaluated on two typical test image sets having 4 
bands ranging from 0.45/zm to 0.8/xm and 7 bpp. These test images contain a large 
variety of natural and urban terrains. Spectral decorrelation subsystem results in two 
significant eigenbands out of four i.e. two eigen bands have negligible variance or infor- 
mation content. Spectral fidelity of the images is also preserved after reconstruction. 



IV 


The maximum loss in correlation coefficient matrix is only 7 percent. These eigen 
bands are furthur spatially decorrelated using embedded zerotree coding method. The 
reconstructed image fidelities ranges from near lossless at about 10:1 compression ratio 
i.e. 0.7 bpp to visually lossless upto 80:1 compression ratio i.e. 0.09 bpp . Compared to 
the compression ratio obtained in [24] based on KLT/DCT methods , the performance 
of the present technique is far superior. 

In this compression system, spectral and spatial modularity allows EZW or KLT 
to be replaced by any other spatial or spectral coding procedure. 

The embedded zerotree coding used , have further advantages. There is no training 
of any kind and no ensemble statistics of the images are used in any way. An interesting 
property of embedded coding is that when the encoding and decoding is terminated 
during middle of a pass or in the middle of the scanning, there are no artifacts produced 
that would indicate where the termination occur. 
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Chapter 1 


Introduction 


Remote sensing systems currently in use * and future systems are likely to be con- 
strained in terms of downlink communication bandwidth. To overcome this problem 
images obtained from satellite and other airborne multispectral collection platforms 
need to be compressed . Multispectral image (MSI) compression has evolved to meet 
this challenge. 

This multispectral imagery can be used in many applications covering areas in 
global environmental monitoring , mapping , charting, geodesy and land use planning. 
There is also the area of natural resourse management including forestry, agriculture, 
water quality monitoring and wild life habitat management. With such a diverse set of 
application comes a wide variety of system requirement. These systems require better 
spatial, spectral and radiometric resolution. The volume of data increases enormously 
with the above requirements. 
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1.1 Motivation for compression 

Need for various compression schemes arises due to the massive increase anticipated 
in the volume of the remote sensing data coupled with existing limitations on channel 
bandwidth. As an example, it is well known that the Earth Observing System carrying 
a High Resolution Image Spectrometer (HIRIS) with 192 spectral channels generate 
100 Mbp/s of data. Similarly the HRMSI (High Resolution Multispectral Imagery ) 
which has 4 multispectral bands and a pan band , with 10 mt GSD and pan band 
collecting 5 mt. GSD with radiometric precision of 9 bits per pixel for pan band and 
10 bits per pixel for the multispectral band and swath width of 40 km. generate 225 
Mbps of data. Advanced system carrying 18 spectral bands , 5 mt.GSD and 100 km. 
of swath generate 4.28 Gbps of data. 

Such dynamic increase in the volume of data transmitted has made it imperative 
to develop novel compression algorithms which can compress remote sensing imagery 
data to manageable rates. 


1.2 Types of compression schemes 

Two forms of data compression are possible. The first is lossless or reversible coding 
method. This method permits exact reconstruction of the original data from the com- 
pressed format. Examples are Huffman codes and run-length code. The reduction in 
volume of data achievable through this is limited and rarely exceeds three. 


The other type of compression technique is the lossy or irreversible compression 
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method. This method often introduces error or distortion in the reconstructed data. It 
can provide significantly higher compression ratios ranging from 8 to 30 with very little 
loss. The goal of lossy data compression is to obtain the best possible reproduction 
for a given bit rate or equivalently to minimize the rate required for a given fidelity 
. The error introduced can be minimized so that the reconstructed image is visually 
indistinguishable from the original. 

The images within a multispectral data set often exhibit high correlations between 
spectral bands in addition to the spatial redundancy present within a band. Exploita- 
tion of the redundancies present result in a large compaction of the data. 

Recent work in the area of multispectral bandwidth compression can be categorized 
into three groups. 

(1) . Vector Quantization based technique : The vector nature of registered MSI makes 
it naturally favourable for VQ . However one of the problem associated with vector 
quantization is the high encoding complexity which grows exponentially with increase 
in vector dimension. 

(2) . Predictive Techniques : With high band to band redundancy, there is an enhanced 
capability to predict the pixels in one band, given knowledge of adjacent bands. Three 
dimensional prediction structures can be constructed by weighting neighbouring pixels 
from within the same band and from adjacent bands to predict the current pixel value. 
The residual value between the predicted and actual value is then quantized and com- 
pressed. 

(3) . 3-D transform based technique : In this scheme KL transform technique is gener- 
ally used to decorrelate the spectral bands. The eigen bands are further decorrelated 
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spatially by using general 2-D transform coder such as the JPEG. Recently wavelet 
transform has been used for spatial decorrelation [10] 

In general, transform based technique outperforms other prediction and vector 
quantization based technique. Results reported in [28] shows that at bit rate greater 
than 1.5 bpp , 3-D transform based technique and other two schemes give the same 
performance. At rate less than 1.5 bpp , 3-D transform based technique performed 
better as compared to other methods. 


1.3 Objective of the Thesis 

The primary objective of the thesis is to develop and test new compression algorithms 
to obtain the best quality multispectral imagery for a given bit rate. 

The first algorithm uses 3-D transformation coding technique in which the KL 
transform is used to exploit the spectral correlation followed by wavelet transform 
with embedded zerotree coding of its coefficients to exploit the spatial redundancy of 
the multispectral image set. 

In the second algorithm the spatial decorrelation is carried out by using wavelet 
packet transform with embedded zerotree coding. 
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1.4 Organization of the thesis 

The thesis has been presented in six chapters. In chapter I we introduce the remote 
sensing system, requirement of multispectral imagery and need for compression. We 
briefly discuss the conventional methods for multispectral image compression. 

Chapter II discusses the overall compression system, encoder and decoder. Chapter 
III is concerned with methods of spectral decorrelation. 

In chapter IV a brief review of the wavelet transform and wavelet packet transform 
is given. The method to select the best basis coefficients of wavelet packet library is 
discussed. Also the method of embedded zerotree coding using above transforms is 
discussed. Lastly adaptive arithmetic coding is reviewed , which is being used to code 
the zerotree structure. 

Chapter V gives the simulation results of the algorithms developed in previous chap- 
ters and compares the performance of these algorithms with other compression schemes. 

Chapter VI concludes the method adopted and the results achieved in this thesis. 



Chapter 2 


Multispectral image compression system : An 
Overview 


A simplified version of most compression system can be easily described by a conceptual 
flow through three different stages, transform quantization and compression, fig 2.1 
shows it. This concept is useful in describing the functional parts of a compression 
algorithm. 


uncompressed 

=» 

bit stream 


transformer 


quantizer 


compression 


output 

compressed 

bit 

stream 


Figure 2.1: Basic compression system 


The first step : a transform represents a method to convert image pixels into a form 
that is more amenable to quantization. This transform can take on many different forms 
from linear prediction or orthogonal transform to vision modelling. This is typically 
considered a ” decorrelation” step because in the case of unitary transformation the 
resulting transform coefficients are relatively uncorrelated. The primary purpose of 
this is to allow each of the coefficients to be quantized independently of the others. 
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Most unitary transform also have the tendency to pack a large fraction of the average 
energy of the image into relatively few components of the transform image. Since the 
total energy is preserved , this means many of the transform coefficients will contain 
very little energy and can be highly compressed. 

The transformed pixel values are real numbers which must also be approximated in 
a finite alphabet, or quantized before they can be transmitted. All the distortion from 
a lossy transform coding scheme is introduced at this step. The range of transformed 
value is divided up into numbered subinterval or bins. Any pixel value falling into a 
bin is approximated by bin’s index. 

The last step of the process , the coding step is where the actual compression takes 
place . Having gone through an entropy reducing step, the resultant quantized values 
can be coded to reduce the overall symbol length. 

2.1 MSI properties relevant to compression 

Though the nature of the distortion introduced into the data varies with the type of 
algorithm, all data compression schemes operate by exploiting redundancy inherent to 
data. 


2.1.1 Spatial Properties 


Pixel to pixel correlation is the form of spatial redundancy that is most often exploited 
by single band image compression algorithms. The normalized autocovariance function 
for pixels /(m,n) in an image is defined by 


C(m,n ) 2 


MN 


M-l N—l 

E E 

fc =0 1=0 


/(£,/)/(£: + m ,/ + n )-/*/ 2 

Cl 2 


( 2 . 1 ) 
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where 


i M-l N-l 

n= mn £ £ 

ni iV m= 0 n=Q 

(2.2) 

i M-l N-l 


MN E E^K™)-/*! 

iVliy 771=0 71=0 

(2.3) 


The behaviour of auto covariance function shows the redundancy present in data. 

2.1.2 Spectral Properties 

MSI compression techiniques can be designed to exploit redundancies that exist be- 
tween spectral bands. Provided that the spectral bands are completely registered, 
significant band to band correlation is present even across large difference in wave- 
length. 

2.2 Overview of the compression system 

Here 3 dimension transform based scheme is used to compress the multispectral im- 
agery. 

2.2.1 Encoder 

The encoder block diagram is shown in fig 2.2. 

The compression system is having two modules. One is spectral decorrelation mod- 
ule and other is spatial decorrelation module. This module based compression system 
scheme allows to replace any module by any different coder. 

(a) Spectral decorrelation module 
It consists of three parts. 

(1). Data partitioning module (optional) : 
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Figure 2.2: Encoder block diagram for multispctral image compre ssion 

In the data partitioning module the set of multispectral images are partitioned into 
sets of non-overlapping images : sub-block sets , which are sequentially fed to the KL 
transformation module for spectral decorrelation. This module is necessary for terrain 
adaptive compression system but not required for non-terrain adaptive compression 
system. 

(2). KL transformation module : 

In the KL transformation module the multispectral sub-block set is spectrally decor- 
related to produce a set of eigen planes. The basis function for the KLT are the 
eigen-vectors of the auto covariance matrix associated with multispectral sub-block set. 
The covariancje matrix is estimated first and then quantized to two bytes per element. 
The eigen planes are formed by matrix multiplication of multispectral block and basis 
function (subblocks and associated basis functions for terrain adaptive case) 
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(3). Mapping module : 

The eigen planes are in floating point format and assume both positive and negative 
values. In this module eigen plane set is converted into the 8 bits eigen image set 
via linear/non-linear mapping of each plane into the 0-255 range. These spectrally 
decorrelated eigen images are fed to spatial decorrelation module. 

(b). Spatial decorrelation module 

To spatially decorrelate the eigen image set, the eigen images are sequentially em- 
bedded coded using zerotrees of wavelet coefficients. The wavelet transform is used 
to exploit the spatial redundancy . The main contribution of the wavelet theory and 
multiresolution analysis is that it provides an elegant framework in which both the 
signals, which are either localized in frequency or space can be analyzed with equal 
resolution. This wavelet transformed image is significant map encoded by predicting 
the absence of significant information across scales by exploiting self similarity inherent 
in the images. This uses a new data structure called zerotree. This significance map 
is quantized with successive approximation quantization and further coded by lossless 
data compression scheme which is achieved via adaptive arithmetic coding. 

2.2.2 Decoder 

The block diagram of decoder is shown in fig 2.3. The mapping information and 
covariance matrix are extracted from the received bit stream. The eigen images are 
reconstructed from their compressed bit stream using EZW decoder. The mapping 
information is used to span the 8 bit dynamic range of the decoded eigen images to 
their original range. The inverse KLT basis functions are obtained from the extracted 
covariance matrix via transposing its eigen- vectors. An inverse KLT is performed to 
produce the reconstructed multispectral image set. In terrain adaptive case sub-blocks 



are mosaiced together to reconstruct the original image set. 



Reconsructed 
Multispectral 
Image set 


Figure 2.3' Block diagram of decoder 







Chapter 3 


Spectral decorrelation subsystem 

Removing inherent spectral correlation in the data results in a significant compaction 
of data to be coded. 

3.1 Spectral decorrelation by KLT 

Figure 3.1 illustrate the KLT orthogonal transformation. KLT is the optimum method 
to decorrelate but it does not have any fixed basis. KLT is reviewed below. 

For a real Nxl random vector x, the covariance matrix of the x vectors is defined 
as 

C x = E[(x - m x )(x - m x ) } (3.1) 

where 

m x = E[x] (3.2) 

is the mean vector, E is the expectation operator and the prime (’) indicates trans- 
position. Equations 3.1 & 3.2 can be approximated from the samples by using the 
relations. 

i M 

( 3 - 3 ) 
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original 

spectrallyrorrelated 

images 



spectrallylecorrelated 
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Figure 3.1: Removing spectral correlation via KL transformation 


and 


1 M 

Cx -~M ~ m *)( Xt _ ( 3 - 4 ) 


or equivalently 


1 M t 

C. = — [yjx.i, ] - m x mj 


(3.5) 


let e t and A,, i=l,2, N be the eigen vectors and corresponding eigenvalues of C x . It 

is assumed for convenience in notation that the eigen values have been arranged in 

decreasing order so that A x > A2 > A 3 > Ajv. A transformation matrix whose 

rows are the eigen vectors of C x is given by 


e n 

e 12 

■ ■ &\N 

e 21 

&22 

■ ■ e 2 Ar 


(3.6) 


|_ ZNl e N 2 • • • J 

where e,y is the j th component of the i tfl eigen vector. The KL transform then consists 
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simply of multiplying a centralized image vector (x — m x ) by A to obtain a new image 
vector y; i.e. 

y = A(x - m x ) (3.7) 

Mean of the transformed vector y is equal to zero vector as can be shown directly 


m y = E[y) 
— E[A(x - m x )] 
= AE[x] — Am x 
= 0 


covariance matrix of y, C y can be expressed in terms of C x as 

C y = E[(Ax — Am x )(Ax — Am x ) ] 

= E[A{x - m x ){x - m x ) A.'] 

= AE[(x - m x )(x - m x ) ]A’ 

= ac x a' 

= A 

= diagonal 

C y is a diagonal matrix which shows that all the coefficient representing inter-band 
correlation are zero or the KL transform coefficient are completely uncorrelated so 
this is called optimal transform. 

For the N spectral images to be decorrelated, we form an N by N KL transforma- 
tion matrix composed of the N ordered eigenvectors of the associated auto covariance 
matrix . Each vector of the spectrally-correlated components form an output vector 
of spectrally decorrelated components . The output vectors are placed adjacent to one 
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another, in the same order as the input vectors, to form the stack of the spectrally 
decorrelated eigen planes. 

3.2 Quantization of the eigen planes 

the spectrally decorrelated eigen planes are represented in floating point real numbers. 
They must be quantized to 8 bit, 10 bit or 12 bit to be subsequentially. 

3.2.1 Linear quantization for multispectral imagery 

For multispectral imagery all eigen planes are linearly quantized into 8 bit images. They 
are the eigen images . Alternatively high dynamic range, low order eigen planes may 
be quantized into 10 bits or 12 bits but practically no appreciable gain in performance 
is achieved. For the 10/8 bit quantizer , only the first two eigen planes are quantized 
with 10 bit quantizer. This gives 5% increase in bitrate and reduces mean square error 
from 0.3 to 0.28 compared to nominal 8 bit quantization of the eigen image. 

It can be concluded that dual 10/8 bit quantizer does not offer better results because 
(i) the maximum induced error is only 1 count in 256. for both the cases (ii) the gain 
in quantization round-off is smaller than the error introduced at the next stage (iii) it 
requires more bit rate. 

3.2.2 Nonlinear/Linear quantization for high dynamic range 

For high dynamic range ( 12 bit) imagery, an optimal 8 bit or 12 bit quantizer may be 
used to quantized the first one or two high dynamic range eigen planes. Lower eigen 
planes may be quantized using a linear 8 bit quantizer. The high dynamic range of 
the first few eigen planes typically have an uneven (unequalized) histogram with long 
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narrow tails. A nonlinear quantizer achieves a substantially lower overall quantization 
error by allocating finer quantization steps to the large central portion of the dynamic 
range. 

3.3 Terrain adaptive approach 

A typical multispectral image set obtained from either satellite or airborne collection 
platforms exhibits a number of different terrains such as water, forest, cloud etc. Each 
terrain has unique spectral signature. Thus to achieve the highest compactness of 
spectral information, the spectral transformation parameters must adapt to the local 
terrain characteristics. In the terrain adaptive approach the covariance matrix, from 
which the spectral transformation basis functions are derived, is updated frequently. 
The smaller the block size over which the covariance is updated is , the more efficient 
is the spectral decorrelation process. The drawback with the selection of a small sub- 
block is the resulting increase in overhead bit rate due to an increase in the number of 
sub-blocks and it is more computationally intensive . Resulting eigen images clearly 
indicates the blocking effect. This approach also results in decrease in variance. This 
decrease in variance shows much greater compaction of data and substantially lower 
performance. 

3.4 Overhead Information 

The overhead information at spectral decorrelation module contains the mean and 
covariance matrix and eigen plane quantization parameters. For non-terrain adaptive ( 
no sub-blocking of original data) spectral decorrelation of N band multispectral image, 
overhead is as given below. 
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(a) . Mean : m x requires 2 bytes per band i.e. total bit required is 16N. 

(b) . Covariance matrix is diagonally symmetric only non-redundant half is trans- 
mitted. Size of covariance matrix is NxN and the no. of non-redundant element is 
[N 2 /2 + N/ 2] the total bit requirement is 8[N 2 + N], when 16 bits per minimum value. 

(c) . Quantization parameters consists of scale factor and minimum value. It re- 
quires 16N bits ,if integer value is transmitted for minimum value and 8 bits for scale 
factor. Note that scale of unity is not transmitted. 

Total bit requirement is 

8 + 161V + 8(iV 2 + N) + 16 N (3.8) 


Overhead bit rate B 0 h in bpp is 


boh — 


8 + 40/7 + 8 N 2 
NB 2 


(3.9) 


where the size of the image is BxB. 

For terrain adaptive approach 

The overhead is K 2 [Nonterrain adaptive approach] where K is or ^ * s 

no. of subblock present. 


3.5 Alternative decorrelating schemes for spectral decorre- 
lation 

KL transform has certain disadvantages as 

(1). Computation burden is high as this transform is not having any fixed basis, 
every time covariance matrix has to be estimated and eigen vectors has to be calculated. 
Furthermore no fast algorithm exists for it. 
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(2). Overhead is also great in this case as whole basis information has to be trans- 
mitted along the basis. 

Because of these disadvantages alternative spectral decorrelation techniques are also 
tried. 

3.5.1 Discrete Cosine Transform 

Discrete cosine transform (DCT) can be used instead of KLT. The advantage with 
DCT is that the basis function is fixed regardless of the characteristics of the data. 
Unlike the KLT , it does not require covariance and eigen vector caculation and , as 
such is much easier to represent. The drawback with this scheme is that the spectral 
decorrelation efficiency will be significantly poor. KLT approach including overhead 
will result into lower bit requirement. 

3.5.2 Affine prediction 

This approach is based on prediction in spectral band and this is simpler to set-up at 
satellite. Here we review the concept of affine predictor 



Considering first the case of two bands labelled band 1 and band 2. The vector is 
formed x = (xi,x 2 ) where X\ is element from first band and x 2 is the element from the 
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second band at identical location. 

The affine predictor is obtained by restricting the predictor f(.) of x 2 to be of the 
form 

x 2 = f(xi) = Axi + B (3.10) 

where A and B are scaler quantities. If B is set to zero then affine predictor reduces to 
the well known linear predictor , the best affine predictor employs A and B chosen to 
minimize the distortion over all possible A and B . The solution to the affine prediction 
problem is derived readily as a simple extension of the linear predictor solution and 
given by 

A = £1(x 2 - E(x 2 ))(x, - E(x,))} n) 

C X i 

where 

C n = £[(*! - E(x l ))(x l - E( Xl ))] (3.12) 

and 

B = E[ Xl \ - A[E[x i] (3.13) 

where E is expectation operator. 

Affine predictor for the two band can be extended to more than two bands. For J 
bands consider (xi, x 2 , x 3 , ....xj) as one vector. Here x, is the element from band i. 
Feature band to be quantized is selected as a subset of these J bands. The remaining 
bands are estimated from quantized feature band. 

In the simplest case only one of the band i is chosen as the feature band . Rest each 
of the bands is predicted separately from the quantized feature band. For the J band 
multispectral data set it requires the design of J-l error vectors. This predictor has 
advantage of simplicity in implementation in satellite but this predicts one band from 
only single band. As in multispectral imagery bands are correlated with many other 
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bands , this predictor does not exploits the redundancy present in spectral domain. 
The performance of this scheme is significantly lower than KLT. 



Chapter 4 


Spatial Decorrelation Subsystem 


Spatial redundancy is inherent in each of the bands of the multispectral imagery. Hence 
it is necessary to exploit this redundancy in order to achieve significant compression of 
multispectral data. 

In image coding traditional transform coders such as those using the DCT, decom- 
pose the images into a representation in which each coefficient corresponds to a fixed 
frequency bandwidth where the bandwidth and spatial area are effectively the same for 
all coefficients in the representation. Edge information tends to disperse so that many 
non-zero coefficients are required to represent the edge with good fidelity. However , 
since the edges represent relatively insignificant energy with respect to entire image , 
these transform coders have been fairly successful at medium and high bit rates. At 
extremely low bit rates these coders tend to allocate too many bits to the ’’trends” or 
the signal behaviour that is more localized in frequency but persists over a large num- 
ber of lags in time domain and few bits are left for ’’anamolies” or the signal behaviour 
that is more localized in time or space domains and tends to be wide band in frequency 
domain. As a result blocking artifacts often result. Wavelet transform technique shows 
promise at extremely low bit rates. 
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In this chapter we review the basic concept of time frequency analysis , wavelet 
transform followed by wavelet packet transform. We discuss the wavelet transform 
technique followed by the embedded zerotree coding method to decorrelate spatially 
the eigen images obtained from the spectral decorrelation subsystem. 

4.1 Wavelet transform : A review 

4.1.1 Continuous wavelet transform 

The continuous wavelet transform of signal s(t) by definition is convolution with a 
wavelet i p(t) dilated by a factor a 


J4 / S (a,6) = a f -)dt 

J a 

(4.1) 

W s (a,b ) = a' 1 / 2 j S(w)i/j(aw)e :,wb dw 

(4.2) 


which is equivalent to filtering the signal s(t) with bandpass filter il>(aw), whose band- 
width changes according to scale parameter a, clearly, large scale corresponds to narrow 
smoothing filters that represented a global view of the signal s(t) and small scales cor- 
respond to filters that look into the details of s(t) (i.e. high frequency components). 

Wavelet expansion of the signal s(t) is essentially a decomposition of its frequency 
content using filters of constant relative bandwidth. The signal can be recovered from 
its wavelet coefficients using 


s(t ) = a- 5 ' 2 / / W a {a,b)^(- -)dadb 

J a J b CL 


(4.3) 
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assuming that 

J^(t) = 0 

f°° Mw) 2 

/ | \dw < oo 

Jo w 

As it is the case with the fourier transform where a signal is expanded in terms of 
complex exponentials of different frequencies. A wavelet transform involves dilation of 
a single (mother) wavelet .The choice of a mother wavelet depends on the application, 
where a particular wavelet is chosen based on its time and frequency localization. 

Orthogonality is an important element of wavelet analysis and a mother wavelet is 
orthogonal to its own dilations and translations. Wavelets provide orthonormal basis 
for expansion of functions that are not of single frequency and are therefore ideal for 
characterizing signals with discontinuities. 


(4.4) 

(4.5) 


4.1.2 Properties of wavelets 

1. ip(w )= 0 at w=0, or equivalently f ip(t)dt = 0 i.e. they have zero d.c. components. 

2. They are band pass signals. 

3. They decay rapidly towards zero with time. 

Property (1) is a consequence of admissibility condition of the wavelets, the condi- 
tion that ensures the wavelet transform has an inverse. The rapid decay of ip(t) is not 
necessary theoretically to be a wavelet. However ip(t) in practice should have compact 
support in order to have good time localization. 
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4.1.3 Discrete Wavelet transform 

The wavelet transform parameters can be discretized so that 

= «r /2 / ■w( t ~"5 r ) < 4 -6) 

Jt <2 q 

where 

« = <*“ (4.7) 

6 = na™T (4.8) 

and T is sampling period. The signal s(t) can be recovered from its expansion coeffi- 
cients using 

*« = ^EE (4.9) 

m n ^0 

where A is constant. 


The case where a 0 = 2 is known as the dyadic wavelet transform, where the signal 
s(t) is bandpass filtered using octave band filters. This type of wavelet has the form 

0 m ,n(ifc) = 2- m/ V(2 ~ m k - n ) m,ncz (4.10) 


The discrete wavelet transform of discrete time sequence s(k) is essentially a multires- 
olution characterization of the s(k). Generally we take the discrete wavelet transform 
of a signal that is both time limited and frequency limited. A continuous time signal, 
uniformly sampled satisfies this criterion. A dyadic discrete wavelet transform is essen- 
tially a decomposition of the spectrum of s(k); S(w) into orthogonal subbands defined 

by 


1 


< w < 


1 


2 JT ~ ~ 2-> +i r 

where j=l,2,....J and T is sampling period associated with s(k). 


(4.11) 
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4.1.4 Time-frquency representation 

In the time-frequency analysis of a non-stationary signal, there are two conflicting 
requirements. That is the window width must be long enough to give the desired fre- 
quency resolution but must be short enough so as not to blur the time dependent events. 


Define the bandwidth A / of the filters as 


.2 / m /) m 

S \GU)M 


and spread in time is given by 


2 / jt 

/ M tWdt 


(4.12) 


(4.13) 


Now resolution in time and frequency cannot be arbitrary small, because their product 
is lower bounded by 

At A/ > — (4.14) 

47 T 

Short time fourier transform (STFT) uses fixed window width. So this represents fixed 
time and frequency resoluton. However by varying the window used one can trade res- 
olution in time for resolution in frequency. Time- frequency diagram shows the effect 
of shift in time r and modulation by w 0 in frequency. Scaling the function by a or 
/'(t)=/(af) results in l' t =~It and l' w —al w , that is both shape and localization of the 
tile have been affected. 


For Wavelet transform for the large a basis function ip a b(t) = becomes a 

stretched version of the prototype wavelet that is a low frequency function, while for 
small a, the basis function becomes a contracted wavelet that is a short high frequency 
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Figure 4.1: Time- frequency representation of wavelet transform 

function. The frequency resolution of the wavelet transform involves a different trade- 
off : at high frequency, the wavelet transform is sharper in frequency. Time frequency 
diagram of wavelet transform is shown in fig 4.1. 


4.1.5 Multiresolution Analysis 

The idea of multi resolution analysis (MRA) is very similar to subband decomposition 
and coding. In the dyadic case the wavelet transform is actually an octave band filter, 
can be interpreted as a constant Q filtering with a set of octave band filters followed 
by sampling at the respective Nyquist frequencies. 
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Multiresolution analysis is the decomposition of a signal into components of different 
scale (frequency) of 2~ m , m integer. Associating with each scale (frequency band) is a 
subspace V m . These subspaces are time functions satisfying 

1. Containment 

Vi C Vi C Vo C V_i C V_ 2 The subspace begin with null space and expands 

in scale of two to reach the space of all square integrable function. In particular if 
x(t ) 6 V-i then x(2t) £ V-\. 

2. Existence of orthonormal scaling functions 

There exists a scaling function (f>(t ) 6 Vo s.t. (j> m ,n(t) = 2~ m ^ 2 (j)(2~ m t — n) , n integer 
forms an orthonormal basis that spans V m . i.e. Vo is the space of all band limited func- 
tions with frequency interval (— 7r, 7 r), set of function cf>{t — n) forms an orthonormal 
basis for Vo- Similarly V-\ is the space of band limited functions with frequency in the 
interval (— 27r,27r). Here is called the scaling function because it derives an ap- 
proximation in Vo of signals in V-\. The scaling function <^ m>n (t) forms an orthonormal 
basis in V m can be exactly represented as a linear combination of <j> m ,n(t). 

3. Basis function defined by two scale difference equation: 

Since <f> 0n (t) spans Vo and ) spans V_! and VL X contains V£> , cf> 0 , o = is a linear 

combination of <A_i, n (i) = \/2<j)(2t — n) 

i.e. 

<S(t) = 2£9W(2<-0 (4.15) 

1=0 

which is a two scale difference equation. Now call W 0 the space of bandpass functions 
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with frequencies in the interval (— 27r , 7r) U (tt , 2tt) then 

VL X = Vo © ^0 (4.16) 

that W 0 is the orthogonal complement in V_i of Vo- In other words K. x is equivalent 
to V 0 plus some added detail corresponding to Wo- 

Generalizing VL t is the space of band limiting functions with frequency in the in- 
terval (— 2 _, 7r, 2*7 t) then we get the relations 

V t C K_i * € Z (4.17) 

and 

K-i = V t @W l (4.18) 

where W t is the space of bandpass functions with frequencies in the interval 
(_2"(‘ +1 )jr, -2 -t r) U (2-tt, 2-(*' +1 )7r). 


iterating, 

V t = W t+1 0 W t+2 0 W t+ 3 © (4.19) 

finally the direct sum of all W 3 ' s j=i+l,i+2, oo is equivalent to the space of 

square integrable function band limited to (— 2 _ ^ +1 ^7r, 0) U (0, — 2 - ^* +1 ^7r) . 

It can be seen that V 0 can be decomposed in the following manner 

y 0 = Wi 0 w 2 0 Wj^eWjQV, (4.20) 


as can be seen in fig 4.2. 
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Figure 4.2: Subspace decomposition of square integrable function 

Now since W 0 C F_i, then a function ip(t) called the wavelet function, whose 
translates ip(t — n ) spans Wq can also be written as a linear combination of 
which spans V_i thus 

W=2'E'h(I)<K2t-l) (4.21) 

from the inner product preserving property 

< 0m,n(O^J, k(t) >= Sm-jSn-k (4-22) 

These ip m<n {t) generate orthogonal subspaces W m . 

(4). Existence of orthonormal wavelet function : 

Wavelet function rj> m , n (t) = 2 _m/2 ^(2 ~ m t - n) are orthonormal with 

< VwWj.fcM >== ?>m-j{>n-k (4-23) 

These ipm, n{t) generate orthogonal subspaces W m . 

4.1.6 Wavelets and filterbanks 

RefFering to the two scale equations 4.15 k 4.21 it is known that the coefficient g(l) 
and h(l) corresponds to lowpass and highpass filters, as two scale equation are the 
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interpolation equation by <j>(2x) of the perfect half band lowpass filter. Actually if the 
filter g(l) is exactly lowpass then h(l) is exactly highpass filter. 

The decomposition of V-i into Wq, W\, W 2 , W3, etc. is essentially a wavelet trans- 
form, since it splits one of the resulting half spaces in two etc. Thus if VLi is the space 
of functions, band limited to (— 27 t,27t) then Vo and W 0 are the spaces of functions 
bandlimited to (—x,x) and (— 2-7T, — x) U (x,2x) respectively. That is by iteration the 
discrete system in fig 4.3 computes exactly the discrete wavelet transform into octave 
band. 
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Figure 4.3: Octave band analysis filter bank 


Octave band filter bank computes the inner products with the basis functions for 

ITj, W 2 Wj and Vj i.e. we get the orthogonal projection of the input signal onto 

Wi,W 2 Wj and V r That is, the input is decomposed into a very coarse resolution 

(which exists in Vj) and added details (which exists in W t ,i = l,2...j ) We will call 
Vj’s approximation spaces and IVj’s detail spaces. Then the process of building up the 
signal is intuitively very clear - one starts with its lower resolution version belonging 
to Vj and adds up the details until the final resolution is reached. 
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4.1.7 Two dimensional wavelets 


The idea is to form a 1-D sequence from the two dimensional image row sequence, do a 
one dimensional MRA, restore the MRA outputs to a 2-D format and repeat another 
MRA to the 1-D column sequences. The two steps of restoring to a 2-D sequence and 
forming a 1-D column sequence can be combined efficiently by appropriately selecting 
the proper points directly from the 1-D MRA outputs. As seen from fig 4.4 after the 
1-D row MRA each low pass and high pass output goes through the 2-D restoration 
and 1-D column formation process and then move on to another MRA. Let £i ,<2 be 
the 2-D co-ordinates and L=lowpass, H=highpass . Then the 2-D separable scaling 
function is 


2D 

form 2 dim 

row 

IMAGE 

sequence 


G 




Figure 4.4: 2-D Multiresolution analysis 


1^2) = ^(*1)^2) 


and the two dimensional separable wavelets are 


LL (4.24) 




LH 


(4.25) 
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^ 3 \ti,t 2 ) = 0(ti)<£(t 2 ) 

HL 

(4.26) 


HH 

(4.27) 


4.1.8 Wavelet packets and generalized filter banks 

In MRA ,the input spectrum at each stage is always split into two bands at a time; the 
higher band becomes one of the outputs while lower band is again split into two bands 
etc. Now each splitting could be into several bands at a time. In addition, there could 
be a furthur splitting of the higher bands as well. There is a wavelet interpretation 
to this generalization of MRA and the outputs are called wavelet packets. Discrete 
wavelet transform providing an octave band , constant Q analysis are rather restrictive 
in the available frequency resolution. Wavelet packets, through arbitrary band splitting 
can have frequency resolutions different from the octave band constant Q scheme. An 
adaptive system can then choose the most suitable resolutions (combination of wavelet 
packets ) to represent a particular signal. 


The defining two-scale equation for wavelet packets are 

^ k \t) = 2Y^g{l)^ k \2t-l) (4.28) 

/= 0 
P-1 

^ 2k+l )(f) = 2 h{l)^ k] {2t - l) (4.29) 

/=o 

with initial condition 

^ (1) w = m (4-30) 

where g(l) and h(l) are the same filter coefficients as those for MRA. Wavelet packets 
come from a linear combination of a mother wavelet i^(t) as depicted in fig 4.5. In 
this block denotes linear combination of input in the form of 4.28 and 4.29. Fourier 
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Figure 4.5: Wavelet packet generation 
transform of a wavelet packet is 

V> J ‘(uO = n G(fw£) (4.31) 

v> 2,t, w = «(f)n G (|w|).* = !. 2 < 4 - 32 ) 

other ipW(w) follows similarly. 


Octave band filter banks can be generalized to arbitrary tree structures, starting 
from a single two channel filter banks, all the way from full grown tree of depth J. 
Figure 4.6 shows all possible tree structure of depth less or equal to two. 

In particular full tree, which yields a linear division of the spectrum similar to the 
short time Fourier transform, and the octave band tree, which performs a two step 
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Figure 4.6: Tree structures of depth less than or equal to two 

discrete time wavelet series expansion. Such arbitrary tree structure, introduced as a 
family of orthonormal bases for discrete time signals, known as wavelet packet defined 
above. The potential of the wavelet packet lies in its capabilities to offer a rich menu 
of orthonormal bases from which the ’’best” can be chosen. 

Denote the equivalent filters by g 3 t [n],i = 0,1,2 ,2 J-1 . In other words g J t [n ] is 

the I th equivalent filter going through one of the possible paths of length j. 

In the general case, with filter banks of depth J, it can be shown that, counting the 
no-split tree, the number of orthonormal bases satisfies 

Mj = Mj_ x + 1 (4.33) 

Among this myriad of bases, there are the STFT like bases given by 

W 0 = [ n - 2 J k], ,g%l[n - 2 J k] keZ (4.34) 

and the wavelet like basis 

W\ — — 2 k],g^\n — 2 2 k] ,g[ J \n — % J k],go f^[n — 2 J k\ k € Z (4.35) 

It can be shown that the sets of basis funtions in all other bases generalized by the 
filter bank tree, are orthonormal. Therefore, the impulse responses of all equivalent 
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filters and their appropriate shifts form an orthonormal bases for l 2 (Z). 


Time-frequency analysis performed by various filter banks are shown in fig 4.7. 



(a) 


(b) 


( c ) 


Figure 4.7: Time-frequency analysis of different filter banks (a). Full tree or STFT 
(b). Octave band tree or wavelet series (c). Arbitrary tree or one of the possible wavelet 
packet 


4.1.9 Adapted subband coding 

Wavelets decorrelate pictures which are close to self similar. It is not clear that any fixed 
choice of subbands will contain suitable templates but it is possible to use a library 
of bases of wavelet packets which are efficiently encoded superpositions of wavelets. 
These adapted subband bases come with a natural quadtree organization and some 
remarkable orthogonality properties. 
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Figure 4.8: Quadtree decomposition of wavelet packets 


A large library of adapted subband bases can be limited by retaining all amplitudes 
in the quadtree. The amplitudes produced at each stage are correlations of the signal 
with compactly supported oscillatory functions called wavelet packets. From the tree 
W of subspaces we may choose a basis subset defined as a collection of mutually or- 
thogonal subspaces W E W, or lists of pairs (n,m) which together span the root. For 
2-D image various decomposition into subbands is shown in fig 4.9. 



W (b) (c) 

Figure 4.9: (a). Basis of subbands (linear) at one level (b). Multiresolution or wavelet 
basis (c). Adapted subband or wavelet packet basis 
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It is possible to introduce a cost function and pick a ’’best” wavelet basis. 

A map jj, from sequence x t to R is called an additive information cost function if 
M(0) = 0 and /r(x t ) = /x(x t ) 

Motivated by ideas from signal processing and communication theory we were led 
to measure the ’’distance” between a basis and a function in terms of the Shannon 
entropy of the expansion. More generally, let H be a Hilbert space, let v € -flMMI = 1 


and assume that H is an orthogonal direct sum H = ® £ H{ We write v = © v t for 
the decomposition of v into its H t components and define 

= (4.36) 

as a measure of distance between v and the orthogonal decomposition, e 2 is character- 
ized by the Shannon equation, which is a version of Pythagoras’ theorem. Let 

iT = ©(£#') (4.37) 

= //+©//_ (4.38) 

Thus H' and H } give orthogonal decompositions. 

H+ = E H. = E tfj.Then 

= (4.39) 

= || K+ ||V(jj^jj;ir) + ||T>.||V(jj^;B,) (4.40) 


That is the Shannon equation for entropy, if we interpret ||Ph+u|| 2 to be, as in quantum 
mechanics, the ’’probability” of v being in the subspace H+. This equation enables us 
to search for a smallest entropy spatial decomposition of a given vector. 
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The following algorithm finds the best basis based^ on entropy. Let 

H W (S) = — x 2 logx 2 (4.41) 

x£S w 

where H W (S ) measures the expense of including W in the decomposition used to rep- 
resent the picture S. 

Define the best basis for representing S to be the basis subset Bo which minimizes 
Y^weB H W (S) over all basis subset B C W . 

For a predetermined deepest level L, label as "kept” each subspace at level L, i.e. 
the subspace indexed by (n,L) for 0 < n < 4 L . Next, set the level index m to L-l. 
Compare the information cost of the subspace W(n,m) with the sum of the information 
costs of its children W(4n,m+1), W(4n+l,m+l), W(4n+2,m+l), W(4n+3,m+l). If 
the parent is less than or equal to the sum of the children, then mark the parent as 
’’kept”. This means that by choosing the parent rather than children, we will have 
less entropy in the representation of S. On the other hand if the sum of the children is 
less than the parent, leave the parent unmarked but attribute to the sum of children’s 
information costs. By passing this along, prior generations will always have their in- 
formation costs compared to the least costly collection descendants. 

After all the subspaces at level m=L-l have been compared to their children, decre- 
ment the level index and continue the comparison. At each level we are compaxing 
the information cost of a node to the sum of the lowest information costs obtainable 
by any decompositions of its four children. We can proceed in this way until we have 
compared the root W(0,0) to its four children. It is claimed that the topmost ’’kept” 



nodes in depth first order constitutes a best basis. 


4.2 Zerotree Coding 

4.2.1 Embedded coding 

An embedded code [25] represents a sequence of binary decisions that distinguish an 
image from the null, or all gray, image. Since, the embedded code contains all lower rate 
codes embedded at the beginning of the bit stream, effectively, the bits are placed in 
order of importance. Using an embedded code, an encoder can terminate the encoding 
at any point thereby allowing a desired bit rate to be met exactly. When the desired 
bit rate is met, the encoding simply stops. Similarly, given a bit stream, the decoder 
can cease decoding at any point and can produce reconstruction corresponding to all 
lower rate encodings. 

4.2.2 Zerotree coding of wavelet coefficients 

In a hierarchical subband system, with the exception of the highest frequency subbands, 
every coefficient at a given scale can be related to a set of coefficient at next finer scale. 
The coefficient at the coarse scale will be called the ’’parent” node, and all the coefficient 
corresponding to the same spatial or temporal location at the next finer scale are called 
’’child” nodes. For a given ’’parent” node, the set of coefficients at all finer scales 
corresponding to the same location are called ’’descendants”. Similarly, for a given 
child , the set of coefficient at all coarser scales of similar orientation corresponding to 
the same location are called ’’ancestors”. 

For a QMF-pyramid subband decomposition , the parent child dependencies are 
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Figure 4.10: Parent -child dependencies of subbands:Note that the arrow points from 
the subbands of parent to the subbands of the children. The lowest frequency subband 
is the top left, and the highest frequency subband is bottom right, also shown is a 
wavelet tree consisting of all of the descendants of a single coefficient in hh3. The 
coefficient in hh3 is a zerotree root if it is insignificant and all of its descendants are 
insignificant. 

shown in 4.10. With the exception of the lowest frequency subband, the parent child 
relationship is defined such that each parent node has three children. The scanning 
of the coefficient is performed in such a way that no child node is scanned before its 
parent. For a N-scale transform , the scan begins at lowest frequency subband, denoted 
as LLu , and scans subbands HL n ,LH HH^ at which point it moves on the scale 
N-l,etc. The scanning pattern for a 3-scale QMF-pyramid is shown in figure 4.11. Note 
that each coefficient within a given subband is scanned before any coefficient in the 


next subband. 
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Figure 4.11: Scanning order of the subbands for encoding a significant map. Note 
that all positions in a given subband are scanned before the scan moves to the next 
subband. 

4.2.3 Zerotree data structure 

This data structure has four symbols. 

1. Positive significant (POS) : The symbol POS is coded when the coefficient is 
positive and the magnitude is more than the threshold value. 

2. Negative significant (NEG) : The symbol NEG is coded when the coefficient is 
negative and the magnitude is more than the threshold. 

3. Isolated zero (IZ) : The symbol IZ is coded when the magnitude of the coefficient 
is insignificant, i.e. less than the threshold, but has atleast one significant descendant. 
4. Zerotree root (ZTR) : The symbol ZTR is coded when the coefficient and all its 


descendants are insignificant. 



4.2.4 Compression of significance maps using zerotrees of wavelet coef- 
ficients 

The zerotree is based on the hypothesis that a wavelet coefficient at a coarser scale is 
insignificant with respect to a given threshold T, then all wavelet coefficients in the 
same spatial location at finer scales are likely to be insignificant with respect to T. 
Empirical evidence suggests that this hypothesis is true. When encoding the finest 
scale coefficients, since coefficients have no children, the symbols in the string come 
from a 3-symbol alphabet, whereby the zerotree symbol is not used. The flowchart for 
the decisions made at each coefficient are shown in figure 4.12. 



Figure 4.12: Flow chart for encoding a coefficient of the significant map 


Zerotree coding reduces the cost of encoding the significant map using self similar- 
ity . Even though the image has been transformed using a decorrelating transform the 
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occurrences of insignificant coefficients are not independent events. More traditional 
techniques employing transform coding typically encode the binary map via some form 
of run-length coding. Unlike the zerotree symbol, which is single terminating symbol 
and applies to all tree depths, run-length encoding requires a symbol for each run- 
length which must be encoded A technique that is closer in spirit to the zerotrees is 
the end-of-block (EOB) used in JPEG, which is also a terminating symbol indicating 
that all remaining DCT coefficient in the block are quantized to zero. To see why 
zerotree may provide an advantage over EOB symbols, consider that a zerotree repre- 
sents the insignificance information in a given orientation over an approximately square 
spatial area at all finer scales upto and including the scale of the zerotree root. Be- 
cause the wavelet transform is a hierarchical representation, varying the scale in which 
a zerotree root occurs automatically adapts the spatial area over which the insignif- 
icance is represented. The end of block (EOB) symbol , however, always represents 
insignificance over same spatial area, although the number of frequency bands within 
this spatial area varies. Given a fixed block size, such as 8x8, there is exactly one 
scale in the wavelet transform in which if a zerotree root is found at that scale, it 
corresponds to same spatial area as the block of the DCT. If a zerotree root is found 
at the coarser scale, then the insignificance pertaining to that orientation can be pre- 
dicted over a large area.Zerotree approach can isolate interesting nonzero details by 
immediately eliminating large insignificant region from consideration. In this zerotree 
approach the focus is on reducing the cost of encoding the significant map so that, for a 
given bit budget, more bits are available to encode expensive coefficients. In practice , a 
large fraction of the insignificant coefficients are efficiently coded as part of the zerotree. 
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4.2.5 Zerotree like structures in DCT and wavelet packets 

The concept of predicting the insignificance of coefficients from low frequency to high 
frequency information corresponding to the same spatial localization is a fairly general 
concept and not specific to the wavelet transform configuration. Zerotrees are equally 
applicable to quincunx wavelets, in which case each parent would have two children 
instead of four except for the lowest frequency, where parents have a single child. 

Similar approach can be applied to linearly spaced subband decomposition such as 
the DCT, and to more general subband decompositions such as wavelet packets and 
Laplacian pyramids. For example one of the possible parent-child relationship for lin- 
early subbands can be seen in fig 4.13. 
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Figure 4.13: One of the possible parent-child dependencies in linear spaced subbands 
systems such as the DCT. Note that the arrow points from the subband of the parents 
to the subband of the children. The lowest frequency subband is the top left, and the 
highest frequency subband is bottom right. 


With the use of linearly spaced subbands, zerotree like coding loses its ability to 
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adapt the spatial extent of the insignificance prediction. Nevertheless it is possible for 
zerotree like coding to outperform EOB coding since more coefficients can be predicted 
from the subband along the diagonal 

For the case of wavelet packets, the situation is more complicated, because a wider 
range tiling of the space-frequency” domain are possible. In that case, it is not always 
possible to define similar parent-child relationships because a high frequency coefficient 
may in fact correspond to a larger spatial area than a co-located lower frequency coef- 
ficient. As can be seen from figures 4.7 & 4.9 , where arbitrary tree is used as the basis, 
in which it is clear that high frequency coefficient is corresponding to more spatial 
area than the low frequency coefficient. The best basis of the wavelet packet library is 
also an arbitrary tree subband decomposition, so it is not possible to define similar 
parent-child relationship. 

It can be seen as zerotrees are predicting the insignificance of coefficients from low 
frequency to high frequency information. This prediction can be applied in arbitrary 
subband configuration as in the case of linear spaced subband discussed above, of 
course this does not exploit the spatial adaptivity. Considering zerotrees as merely 
a prediction technique, it is found possible to use zerotrees data structure for coding 
arbitrary tree decomposed or more specifically the bestbasis decomposed images. That 
is irrespective of what the subband decomposition is , high frequency coefficients are 
being predicted in the same way as in the wavelet transform decomposition case. This 
scheme is not as efficient as in the wavelet transform cofiguration but for the bestbasis 
case this gives almost the equivalent performance. 
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Summarizing it can be said as it is possible to define zerotree structure for other 
subband decomposition but it is most efficient for the case of wavelet transform de- 
composition. 


4.2.6 Successive approximation quantization 

In the previous section we described a method of encoding significance maps of wavelet 
coefficients that, at least empirically, seems to consistently produce a code with a lower 
bit rate than either the empirical first order entropy, or a run-length code of the sig- 
nificant map. 

To perform the embedded coding successive approximation quantization is applied. 
The successive approximation quantization sequentially applies a sequence of thresh- 
olds T 0 , T’i, ....,Tyv-i to determine significance, where the thresholds are chosen so that 
T, = 7\=i. 

A wavelet coefficient x is said to be insignificant with respect to a given threshold 
T if |x| < T . The initial threshold T 0 is chosen so that \X : \ < 2T 0 for all transform 

coefficient Xj. 

During the encoding (and decoding) two link-lists, one for dominant pass and the 
other for subordinate pass, of wavelet coefficients are maintained. At that point in 
the process ,the dominant list keeps track of the coefficients that have not found to 
be significant in the same relative order as the initial scan. This scan is such that 
the subbands are ordered, and within each subband, the set of coefficients are or- 
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dered. Thus using the ordering of the subbands shown in figure 4.11, the subordinate 
list contains the magnitudes of those coefficients that have been found to be significant. 

During the dominant pass coefficients that have not been found to be significant 
in the previous scan, are compared to the threshold T t to determine the significance, 
and if significant , their sign. The significance map is then zerotree coded. Each time a 
coefficient is encoded as significant, (positive or negative), its magnitude is appended 
to the subordinate list. The coefficient that had been determined to be significant in 
the previous scan is considered as insignificant in all the following dominant passes at 
smaller thresholds. 

During a subordinate pass, the width of the effective quantizer step size, which 
defines an uncertainty interval for the true magnitude of the coefficient, is halved. For 
each magnitude on the subordinate list, this refinement can be encoded using binary 
alphabet with a 1 symbol indicating the lower half. The string of symbols from this 
binary alphabet that is generated during a subordinate pass is then entropy coded. 

This process continues alternately between dominant passes and subordinate passes 
where the threshold is halved before each dominant pass. 

In the decoding operation, each decoded symbol, both during a dominant and a 
subordinate pass, refines and reduces the width of the uncertainty interval in which the 
true value of the coefficient (or coefficients in the case of a zerotree root ) may occur, 
the reconstruction value can be any where in that uncertainty interval. For minimum 
mean square error distortion, one could use the centroid of the uncertainty region using 
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some model for the PDF of the coefficients. However a practical approach, that is used 
in the experiments , which is also MINMAX optimal, is to simply use the center of the 
uncertainty interval as the reconstruction value. 

The encoding stops when some desired terminating condition is met, such as when 
the bit budget is exhausted. 


4.3 Arithmetic Coding 

Arithmetic coding is a lossless compression technique that produces an encoded string 
for an input string of symbols and model. This encoded string represents a fractional 
value R for the range 0 <= R < 1. 

Arithmetic coding [29] is superior to the well known Huffman method in many re- 
spects. It represents information as compactly as Huffman code. It is known that each 
symbol in the input string is represented as an integral number of bits in the encoding, 
then Huffman coding achieves ’’minimum redundacy”. In other words , it performs op- 
timally if all symbol probabilities are integral powers of 1/2. But in practice this is not 
so. Arithmetic coding dispenses with this restriction that each symbol be represented 
as an integral number of bits. 

In arithmetic coding, a message is represented by an interval of real numbers be- 
tween 0 to 1. As the message becomes longer , the interval needed to represent it 
becomes smaller, and the number of bits needed to specify that interval grows. Succes- 
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sive symbols of the message reduce the size of interval in accordance with the symbol 
probabilities generated by the model. The more likely symbols reduce the range by 
smaller amounts as compared to unlikely symbols and add fewer bits to the message. 
Before transmitting a message the range of the message in the entire interval is [0,1), 
and the half open interval is denoted by 0 <= x < l.As each symbol is processed, the 
range is narrowed to the portion allocated to the symbol. 

We can divide models for the arithmetic coding into two categories. 1. Fixed model 
and 2. Adaptive model 

In the fixed model, frequency of symbol occurences are taken from sample text. 

In the adaptive model , frequencies are initialized to some value during encoding, 
and these frequencies are updated on the basis of symbol frequencies observed in the 
input string. 

Arithmetic encoder operates successively on each data symbol, determines the con- 
text ( i.e. which relative frequency distribution applies to the current event) and 
generates the code string. 

To represent the magnitude R of the encoded string in the interval [0, 1) great pre- 
cision is required. Fortunately this magnitude need not be given all at once. At any 
stage the upper and lower bounds for R are available as a finite no. of digits. These 
digits are left shifted as they become identical and new digits are brought at the low- 
significant end. 



50 


4.3.1 Arithmetic coding in context of zerotrees 

Note that the particular alphabet used by the arithmetic coder at any given time con- 
tains either 2,3 or 4 symbols depending on whether the encoding is for subordinate pass 
, or dominant pass with no zerotree root symbol, or a dominant pass with a zerotree 
root symbol. There is advantage in adapting the arithmetic coder. Since there are 
never more than four symbols, all the possibilities typically occur with in a reasonably 
measurable frequency. This allows an adaptation algorithm with a short memory to 
learn quickly and track continuously changes in symbol probabilities. This adaptivity 
accounts for some of the effectiveness of the overall algorithm. On the other hand in 
cause of algorithms that donot use succesive approximation, several events are needed 
before an adaptive entropy coder can reliably estimate the probabilities of unlikely 
symbols. 

Once the type of model (adaptive model for the present case) is fixed for arithmetic 
coding, maximum frequency count is the critical parameter for the coding, because it 
affects underflow, overflow and learning rate for adaptation. Arithmetic coding works 
by scaling the cumulative probabilities given by the model onto the interval [low, high] 
for each character encoded.If they are very close together , then there is possibility 
of mapping different symbols in the same interval. Therefore , the interval should 
atleast be as large as possible. It should not be too large to cause overflow. Learning 
rate for adaptation is inversely proportional to the maximum histogram count. Again, 
very small symbol set from zerotree coding is adavantageous in choosing maximum 
histogram count. A maximum frequency count of 256 is used in taking account all the 
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factors given above. 


4.4 Overhead information of spatial decorrelation subsystem 

This contains only 8 bytes header. After this header entire bit stream is arithmetically 
encoded by a single arithmetic coder with an adaptive model. This header contains 
(1). No. of wavelet scales (2). Dimension of the image (3). The initial threshold (4). 
Mantissa. 


4.5 Bit rate assignment for eigen images 

Compression performance is strongly influenced by the selected bit assignment scheme 
for the spectrally decorrelated eigen images. Since the variances of the ordered eigen 
planes decrease almost exponentially , it may be suggested to code the eigen images 
at rates proportional to the logarithm or squareroots of their variances. Although this 
bit assignment strategy may result in lowest MSE, it is not suitable for multispectral 
imagery, the subtle variations in the spectral signatures of certain terrains manifest 
themselves in the lower eigen images with the same level of accuracy as the others , 
irrespective of their variances of the resulting overall mean square coding error . In 
general the criterion for bit assignment should induce a uniform coding error on all the 
reconstructed eigen planes. 


These results are derived from an 
derived from theoretical principles. 


extensive set of experimentation and cannot be 
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To implement the optimum bit assignment scheme, an arbitrary MSE for the first 
eigen image is selected. The effective MSE for the remaining eigen images are calcu- 
lated as 


MSE(n) = L( = 2 , 3,4 ( 4 . 42 ) 

Since it is not known in advance what MSE results for a particular eigen image at 
given bit budget, it is necessary to look-up table of MSE vs. bytes be transmitted for 
each eigen image. 

In general it is beneficial to replace some of the low variance eigen images by their 
mean values. This results in (1). good performance in terms of MSE (2). considerable 
savings in computation and power requirement, since fewer eigen images need to be 
coded. 

The most simple method is a compromise between complexity and performance. In 
which bit rate is assigned to eigen images based' on variance and the initial quantization 


error. 



















Chapter 5 


Experimental Results and Discussion 


The present chapter discusses the experimental results of the following multispectral 
image compression schemes : 

(1) .KL transform followed by embedded zerotree coding using wavelet transform 

(2) .KL transform followed by embedded zerotree coding using wavelet packet trans- 
form coefficients. 

Each of these schemes were examined using two different sets of images. The image 
sots used are : 

(1) . A semi-urban image 

(2) . An urban image 

Figures (imgl ) Sz (img2) show the original image sets 1 & 2 respectively. The test 
images contain diverse range of natural and urban terrains and as such are difficult for 
compression. The specifications of the test image sets 1 & 2 are given below . 

(a). Spectral resolution - 

Each set is a 4 band image in the following frequency range : 

Band 1 0.45 - 0.52 fim 
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Band 2 - 0.52 - 0.6 pin 
Band 3 0.6 - 0.7 pm 

Band 4 - 0.7 - 0.8 pm 

(b) . Spatial resolution- 

The spatial resolution measured in terms of ground sample distance (GSD) is 36.25m. 

(c) . Radiometric precision- 

Each image pixel is represented by 7 bpp or 0-128 gray levels. 

5.1 Measures for compression system efficiency 

The correlation coefficient is a convenient and useful method to measure the inherent 
spectral decorrelation. The correlation coefficient matrix is defined as normalized co- 
variance matrix. That is, the correlation coefficient for each pair of bands is equal to 
their covariance value divided by the squareroot of the product of their variances. 


Objective measures to judge the quality of the reconstructed images are peak signal 
to noise ratio and average mean square error. As is common in the image coding 
literature [11], the square of the maximum intensity is used as the definition of signal 
power and the values are reported in dB units. The mean square error of a reconstructed 
image of size NxN is defined as 


MSE 


1 

A P 


EE 

i=i j=l 


X, 


(5.1) 


where Xij is the pixel at the i lh row and j th column in the image and x tJ is its recon- 


structed value. 
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The power ratio for 7 bpp image set is given by 

127 2 

P ~ MSE 

The value in dB units of PSNR is given by 


PS NR = lOlogiop 


The bit rate for the image set is given by 


bitrate(bpp) = 


Total no. of bytes transmitted * 8 
Total no. of pixels 


(5.2) 


(5.3) 


(5.4) 


5.2 Spectral Decorrelation Efficiency 

It is well known that KL transform is an optimum method to spectrally decorrelate the 
multispectral images. Figures (img3) and (img4) show the spectrally decorrelated eigen 
images of test image set 1 & 2 respectively. The compaction of the data as a result of 
KLT operation is clearly evident. It is observed that more than 90 % of the energy or 
information content of the data sets resides in the first two eigen images and more than 
98 % of the energy in the three eigen images. The remaining eigen image has very little 
information content of the data and as such require substantially fewer bits to be coded. 


A convenient approach to measure the amount of data compaction is to measure 
variances of the eigen images. The variance of an image reflects its busyness or infor- 
mation content. 

Figures 5.1, 5.2 and 5.3, 5.4 show the covariance and correlation coefficient matrix 
of image set 1 and image set 2 respectively and figures 5.5, 5.6 and 5.7, 5.8 show the 
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covariance and correlation coefficient matrix for spectrally decorrelated eigen images 
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Figure 5.1: Covariance matrix of original semi-urban image set 


1 

0.91 

0.9 

0.24 

0.91 

1 

0.96 

0.35 

0.9 

0.90 

1 

0.29 

0.24 

0.35 

0.29 

1 


Figure 5.2: Correlation coefficient matrix of original semi-urban image set 
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Figure 5.3: Covariance matrix of original urban test image 
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Figure 5.4: Correlation coefficient matrix of original urban image set 



Figure 5.5: Covariance matrix of eigen images of semi-urban test image 


It is observed that the oir-diagonal values of correlation coefficient matrix of eigen 
images are zero. That is, the eigen images are completely decorrelated. The variance 
of eigen images drops almost exponentially with the order of the eigen images. Plots 
L and 2 show the normalized variance with respect to bands no. for image sets 1 and 
2 respectively. The steeper the drop in the variance in moving from low to higher or- 
der eigen images, more efficient data compaction and spectral decorrelation is achieved. 


To compare the performance of KLT with other spectral decorrelation schemes , 
experiments are done on these test images using DCT and affine prediction . Tables 
5.1 & 5.2 show the variation of normalized variance with respect to band no. for KLT, 
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band no. 

variance 

1 

90.43 

2 

40.54 

3 

1 .034 

4 

0.79 


( a ) 


band no. 

variance 

1 

50.92 

2 

39.35 

3 

31.82 

4 

1 1 .38 


(h) 


band no. 

variance 

1 

84.07 

2 

13.48 

3 

10.14 

4 

1.47 


(<•) 


Fable 5.1: Comparison of spectral decorrelation efficincy of different spectral decorre- 
lation techniques.The tables show the variance of eigen planes (a). when KLT is used 
(b).when DOT is used (c).when affine predictor is used to spectrally decorrelate the 

original image set 1 
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band no. 

variance 

1 

132.678 

2 

43.26 

3 

3.58 

4 

0.60 


(a) 


band no. 

variance 

1 

112.944 

2 

27.32 

3 

22.32 

4 

17.87 

0>) 

j 

band no. 

variance 

1 

98.564 

2 

31.36 

3 

28. 13 

4 

4.26 


(<=) 


Table 5.2: Comparison of spectral decorrelation efficiency of different spectral decor- 
relation techniques.The tables shows the variance of eigen planes l).when KLT is used 
2). when DCT is used 3).when affine predictor is used to spectrally decorrelate the 

original image set 2 
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figure 5.6: (correlation coefficient matrix of eigen images of semi-urban test image 
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Figure 5.7: Covarince matrix of eigen images of urban test image 


IX .'T and affine prediction spectral decorrelation schemes for test images 1 & 2 respec- 
tively. It is observed that for the KLT scheme two eigen images are having negligible 
variance whereas in DCT and affine predictor scheme only one band is having negligible 
variance. Referring to the plots L and 2, which also show the variation of normalized 
variance with respect to band no. for the DCT and affine predictors, it is observed 
that the KL transform is the most efficient of all other transforms. 


Although KL transform has no fast algorithm for its computation and requires few 
overhead bits , it is the optimum decorrelating transform. Overhead bits needed is of 
the order of 0.0003 bits per pixel which is negligible in comparison with the transmit- 
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1* igure 5.8: (correlation coefficient matrix of eigen images of urban test image 


ting bits (0.09 bpp ) per pixel. Affine prediction may work better if prediction is done, 
not only with feature band alone but with all other bands. 


In the present work, test image sets used have only 4 bands. From the results re- 
ported in [24] , it is concluded that the spectral decorrelation efficiency increases with 
the no. of bands. Hence better compaction can be expected on using multispectral 
image data sets containing more no. of bands. 


5.3 Terrain adaptive test 

'Flu* terrain adaptive test was performed on these test images. Plot 3 shows the relative 
sixes of variance of each eigen plane of image set 1 for non-terrain adaptive and ter- 
rain adaptive approaches for different window size of 128, 64 and 32 respectively. The 
significant decrease in the variance as a result of terrain adaptation is observed, which 
indicates a greater compaction of data. 


From the fig (irngb) which shows the eigen images for the terrain adaptive approach 
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with a window size 64, it is observed that for the terrain adaptive approach, the eigen 
images appear to have discontinuities over the edges of the selected covariance update 
window. I he blocking effect clearly indicates the adaptation of the spectral transfor- 
mation pro< ess to the characteristics of the terrain within the selected window. 

I able 5..} gives the rate vs. distortion results when terrain adaptive approach with 
a window size of 64 is used. In plot 4 a comparison of mean square error vs. bpp of 
non-terrain adaptive and terrain adaptive approaches is made when each of them is 
followed by embedded zero tree coding of wavelet coefficients. It is observed that ter- 
rain adaptive approach does not improve the performance (mse vs. bpp). Fig. (img6) 
shows the reconstructed images of test image set 2 at 80:1 CR ( i.e. 0.09 bpp). It is 
observed that blocking effect is visible even in the reconstructed image. 


5.4 Overhead in spectral decorrelation subsystem 

In the present work only linear quantization was used to map eigen planes to eigen 
image's as the dynamic range was low ( less than 127) . For the non-terrain adaptive 
approach overheat 1 bits B 0 k needed is obtained using results given in section 3.4 
N = No. of bands =4 
B = Image dimension = 512 


B oh 


40 + 8 N 2 

~ NB 2 

40 * 4 + 8 * 4 2 
4*512*512 


bpp 

bpp 


= 0. 000274 bpp 
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bpp 

CR 

mse 

PS NR 

0.7 

10 : 1 

3.52 

36.6 L 

0.44 

16 : 1 

4.37 

35.67 

0.32 

22 : 1 

5.15 

; 

; 

34.95 

0.18 

40 : 1 

; 

7.43 

33.36 

0.11 

62 : 1 

9.37 

32.35 

0.09 

. __ 

77 : 1 

10.22 

31.98 


Table 5.3: Rate vs. distortion results when Terrain adaptive KLT with window size 64 
is followed by zero trees in wavelet coefficients 
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It can bo soon that li 0 h is quite insignificant in comparison to bits per pixel tans- 
mitted. 

For terrain adaptive approach it is given by 

B„h = A 1 * overhead of non — terrain adaptive approach. 

when 

_ dimension of image 
dimension of window 

for window size 64 and image dimension 512 overhead comes to 0.017 bpp 
significant at very low bit rate i.e. 0.09 bpp. 

5.5 Non- terrain adaptive test 

The non- terrain adaptive compression algorithms have been simulated for two sets of 
test images. The performance measure used for the compression algorithms is rate 
vs. distortion variation. The performance of the different compression algorithms are 
given in tables 5.4, 5.5 and 5.6 respectively for test image set 1 and in tables 5.7, 5.8 
and 5.9 respectively for test image set 2. 

'Fable 5.4 gives the estimate of performance when KL transform is used with em- 
bedded zerotree root coding using wavelet transform coefficients. Table 5.5 shows the 
performance when KI.T is followed by embedded coding using arbitrary wavelet packet 
coefficients (i.e. linear subband decomposition when all the coefficients of lowest sub- 
bands are "kept") . Table 5.6 shows the performance when KLT is followed by zerotree 
of bestbasis coefficients of the wavelet packet library. Plots 5 and 6 compare the mean 
square error ami PSNll vs. bpp of these algorithms respectively . As the variation of 


(5.5) 

(5.6) 
. It is quite 
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mse and PSNR vs. bpp respectively for each of the above schemes is very small , it 
can be concluded that the performance of algorithms are comparable. 

Visual evaluation of the reconstructed image set 1 (semi-urban) shown in fig (img 
7) at compression ratio 100:1 (i.e. 0.07 bpp) reveals that a visually lossless performance 
can be achieved upto this CR.. 

I‘or image set 2 (i.e. urban image) table 5.7 summarises the performance of com- 
pression when KL transform is followed by embedded zerotree coding using wavelet 
transform coefficients. Table 5.8 summarises the performance when KLT is followed 
by embedded coding using arbitrary wavelet packet coefficients (i.e. linear subband 
decomposition when all the coefficients of lowest subbands are ’’kept”) and table 5.9 
shows the performance when KLT is followed by embedded zerotree coding of bestbasis 
coefficients of t he wavelet packet library. Plots 7 and 8 compare the mean square error 
and PSNR vs. bpp of these algorithms respectively. As the variation of mse and PSNR 
vs. bpp respectively for each of the above schemes is very small , it can be concluded 
that the perfon nance of algorithms are comparable. 

Figures (imgS) , (img!)) and (imglO) shows the reconstructed image for set 2 at 80:1 
('R , when each of the above mentioned compression algorithms are applied respec- 
tively. Visual evaluation shows that the images are visually lossless upto a compression 
ratio of 80:1 (i.e. 0.09 bpp). 


The comparison of these algorithms show that no significant improvement is ob- 
tained by using the best basis coefficients of wavelet packet library. This is so because 
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0.5 

0.44 16 : 1 1.99 39.08 

0.30 20:1 2.195 38.66 

0.3 23 : 1 2.38 38.31 

0.198 35:1 2.96 37.36 
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0.061 115:1 5.15 34.95 

0.054 130 : 1 5.432 34.72 

0.023 300 : l 7.27 33.53 


Table 5.4: Rate Vs. distortion results for semi-urban image when KLT/Zerotrees in 
wavelet coefficients are used 
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Table 5.5: Rate Vs. distortion results for semi-urban image when KLT/Zerotrees 
coelfidents of best basis of wavelet packet library are used 



0.549 12:1 1.67 39.83 

0.5 11:1 1.93 39.22 

0.36 20:1 2.135 38.78 

0.27 25.5:1 2.45 38.17 

0.168 42:1 3.23 36.97 

0.106 66 : l 3.96 36.09 

0.076 92 : 1 1.65 35.4 

0.015 152.5:1 5.713 31.5 

0.023 300 : 1 7.24 33.47 

Table 5.6: Rate Vs. distortion results for semi-urban image when KLT/Zerotrees in 
wavelet packet coefficients are used 
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Table 5.7: Rate Vs. distortion results for urban image when KLT/Zerotrees in wavelet 
coefficients are used 
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2,58 

38.29 

0.640 

10.9 : 1 

2.89 

37.46 

0.565 

12,1 : 1 

5.205 

37.02 

0.155 

16 : 1 

5 94 

36.11 

0.56 

19 : 1 

4.535 

35.51 

0.55 

i 

21 : 1 

1.77 

35.28 

0.25 

50 : 1 

6.17 

34.17 

0.1525 

15 : 1 

7,88 

33.1 

0.09 

77 : 1 

10.12 

32.02 

0.07 

100 : 1 

11. 1 

31.62 

0.058 

150 : 1 

14.15 

30.56 


Table 5.8: Rate Vs. distortion results for urban image when KLT/Zerotrees m best 

* « * * I likrart/ arp 
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bpp 

CR 

mse 

PSNR 

0.7 

10 : 1 

2.52 

58.04 

0.11 

Hi : 1 

•1.88 

55.18 

0.52 

22 : 1 

5.02 

54.54 

0.18 

10 : I 

7.07 

55.18 

0.11 

02 : I 

0.00 

52.22 

0.00 

77 : I 

10.2!) 

51.05 


Table 5.10: Hale vs. distortion results when KLT is followed by zerotrees in DCT 
coefficient s 

the zeiotrce coding approach is not adapting to the spatial extent of the coefficients 
[25) . St is only predicting the insignificant coefficients. The results also conclude that 
zerotrees for any arbitrary subband decomposition can be defined, it is however the 
most apptopiiafe coding scheme only when wavelet transform is used. 

To judge the improvement in performance of wavelet transform over DCT, embed- 
ded zerotree coding method was also applied on DCT coefficients of image set 2. Table 
5.10 shows the rate vs. distortion measurement and plot 8 gives mean square error 
vs. bpp for image set 2. It is seen that the variation of mse and PSNR in this case 
is almost equivalent to that when wavelet transform was used. However DCT has an 
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additional 

fig (imgl 1 
sion ratio 


drawback of producing blocking effects. Visual evaluation of the images in 
I shows that the blocking effect dominates after 40:1 (0.18 bpp ) compo- 
und the image is visually lossy. For the purpose of comparison, fig(imgl2) 


shows the toconstnictod image set 2 


at CR of 80:1, when zerotree coding uses DCT 


coefficients; blocking effects are dearly visible in the reconstructed images. 


5.6 Spectral Fidelity 

In the bandwidth compression of multispectral imagery , preservation of the spectral 
resolution across hands is as essential as preserving the spatial resolution within each 
band. This is necessary in order to preserve the spectral fidelity of the data , i.e. spec- 
tral signatures of the terrains. It is important that the spectral signatures should not 
be lost, during the compression process . 

To measure the spectral fidelity of multispectral imagery a useful tool is devised 
winch is based on correlation coefficient matrix. The spectral fidelity is depicted by 
t he measurement of the correlation coefficient matrix. Any deviation from the original 
correlat ion euetfh lent matrix indicates the loss in spectral fidelity, the loss of correlation 
coefficient matrix in urban image are shown in fig 5.9 . 

The loss of correlation coefficient matrix in semi-urban (image set 2) is shown in 

the fig 5.10. 

These experimental results indicate that the loss of spectral fidelity , as measured 
by the deviation from the original correlation coefficient matrix is insignificant. Even at 
the compression ratio of 80:1 for urban image the loss is no more than 0.07 i.e. only 7 
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Figure 5.0: l.u,s in ronelation coefficient matrix of image set 2 (a), when KLT is 
followed by /eiotiee,s in Wavelet coefficients at CR 16:1 (b). when KLT is followed by 
wavelet, tramfonu at CR SO: l (<:). when KLT is followed by wavelet packet coefficients 
at SO: l CR (d). when KLT is followed by best basis coefficients of wavelet packet 
library at SO: 1 ( 'R 

percent . I 'his loss for semi-urban image is no more than 0.13 at 100:1 CR. 

For terrain adaptive case, the spectral fidelity is also calculated. For image set 2 
(urban image), the loss in spectral fidelity for 80:1 compression ratio when embedded 
zerotrre coding of wavelet coefficients is used, is given in fig 5.11. 

The above calculation shows that loss in spectral fidelity is insignificant even at 
very low bit rate. This shows that the efficiency of algorithm is good even at very 
low bit rate or very high compression ratio. In terrain adaptive approach there is an 
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Figun' '>.10: boss in correlation coefficients of image set 1 (a), when KLT is fol- 
lowed l>y Kmbedded zerotree coding using wavelet coeficients at 16:1 CR (b). when 
KLT/xcrottees in wavelet, coefiicients is used at 100:1 CR 


improvement of 2 1 / in spectral fidelity between few bands. Comparing spectral fidelity 
for non tenain adaptive scheme (fig 5.9) and terrain adaptive scheme (fig 5.11), we 
observe that the improvement, in spectral fidelity exists between band 4 and band 1, 
hand 1 and band 2 K' band 4 and baud 3. 


5.7 Comparison of performance with other compression schemes 

In a ie« ent work [28} [1] [11) related to lossy compression of multispectral imagery it is 
mentioned that the mean residual and gain shaped VQ yield the best results with CR 
20:1. Nonline.u VQ was found to yield the best qualitative performance with a CR of 

22 : 1 . 


For a hieian hieal data compression system CR of 27:1 was obtained. Another sys- 
tem is presented (it)} in which the 3-1) data is first spectrally decorrelated by using 
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Fi^un* 5,11: Loss in correlation coefficient matrix of image set 2 at 80:1 CR when 
Terrain adaptive KLT window size 64 is followed by Embedded zerotree coding using 
wavelet coeiiieients 

K L t ranslonn an<l then spatially decorrelatcd the principal components using discrete 
wavelet (tattsfonn .It was reported that this yielded 40% improvement in compression. 

Recently, in [21] a compression system is described in which the KL transform is 
followed by JRF.d. A visually lossless reconstruction of a multispectral image set with 
H): 1 < 'R was at hieved. 

In the piesent wotk, where KL transform is followed by embedded zerotree cod- 
ing using wavelet t oeffieients and wavelet packet coefficients respectively gave visually 
lossless iet oust met ion for urban image at a CR of 80:1 (i.e. 0.09 bpp) . 

In this s< herue, encoder is fully adaptive as the model is initialized at each thresh- 
old for each of the dominant, and subordinate passes. There is no training of any kind 
and no ensemble statistics of images are used in any way. An interesting property of 
embedded coding is that when the encoding and decoding is terminated during middle 
of a pass or in the middle of scanning of a .subband there are no artifacts produced 
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that would indiratr where the termination occur. Another interesting property of the 
embedded ending is that because of the implicit global bit allocation even at extremely 
high CK of '.lob: l the image quality is poor but still recognizable. 



Chapter 6 


Conclusion 

In tins thesis ,ui in\<*st igat i»m of ,i ,'{ I) transform based technique to compress multi- 
spec! r.ii imagery has been carried out. The algorithm exploits the inherent spectral 
and spatial « oiielat ion present, in the image data sets. The compression technique 
essentially consists of two subsystems : The spectral decorrelation and the spatial 
decoi relation subsystem respe< lively. Karhunen-Loeve transform(KLT) used for spec- 
tral deeon elation of the bands was followed by an embedded zerotree coding of wavelet 
transfonii < uejfh leuts to achieve spatial decorrelation of each band. 

I he* spec hal and spatial modularity of the algorithm architecture allows EZW or 
Kl.r to be teplacvcl bv any alternate* spatial or spectral coding procedure. 

Superior pet formuuc e of the adopted compression scheme judged from the recon- 
structed image fidelities ranges from near lossless at about 10.T CR to visually 

lossy beginning at about St); 1 CR for two typical 4 band image data sets containing ur- 
bau/semiurban tei rains. Better results can be expected if test image sets with large no. 
of spectral bands were available. 1 he compression achieved using the present method 
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is better than other multispectral image compression schemes reported 1 m. literature. 
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